APUNet: Revitalizing GPU as Packet Processing Accelerator

نویسندگان

  • Younghwan Go
  • Muhammad Asim Jamshed
  • YoungGyoun Moon
  • Changho Hwang
  • KyoungSoo Park
چکیده

Many research works have recently experimented with GPU to accelerate packet processing in network applications. Most works have shown that GPU brings a significant performance boost when it is compared to the CPUonly approach, thanks to its highly-parallel computation capacity and large memory bandwidth. However, a recent work argues that for many applications, the key enabler for high performance is the inherent feature of GPU that automatically hides memory access latency rather than its parallel computation power. It also claims that CPU can outperform or achieve a similar performance as GPU if its code is re-arranged to run concurrently with memory access, employing optimization techniques such as group prefetching and software pipelining. In this paper, we revisit the claim of the work and see if it can be generalized to a large class of network applications. Our findings with eight popular algorithms widely used in network applications show that (a) there are many compute-bound algorithms that do benefit from the parallel computation capacity of GPU while CPU-based optimizations fail to help, and (b) the relative performance advantage of CPU over GPU in most applications is due to data transfer bottleneck in PCIe communication of discrete GPU rather than lack of capacity of GPU itself. To avoid the PCIe bottleneck, we suggest employing integrated GPU in recent APU platforms as a cost-effective packet processing accelerator. We address a number of practical issues in fully exploiting the capacity of APU and show that network applications based on APU achieve multi-10 Gbps performance for many compute/memory-intensive algorithms.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Evaluation iterative solver for pCDR on GPU accelerator

In the past few years, the graphics processing units (GPU) has become trend in high performance computing (HPC). The newest Top500 list was showed three supercomputers contain GPU accelerator on Top10 in Nov. 2010. The role of the GPU accelerator has become more and more important for scientific computing and computational fluid dynamic (CFD) to obtain result quickly and efficiently. The GPU ha...

متن کامل

Improvement and parallelization of Snort network intrusion detection mechanism using graphics processing unit

Nowadays, Network Intrusion Detection Systems (NIDS) are widely used to provide full security on computer networks. IDS are categorized into two primary types, including signature-based systems and anomaly-based systems. The former is more commonly used than the latter due to its lower error rate. The core of a signature-based IDS is the pattern matching. This process is inherently a computatio...

متن کامل

SWM: Simplified Wu-Manber for GPU-based Deep Packet Inspection

Graphics processing units (GPU) have potential to speed up deep packet inspection (DPI) by processing many packets in parallel. However, popular methods of DPI such as deterministic finite automata are limited because they are single stride. Alternatively, the complexity of multiple stride methods is not appropriate for the SIMD operation of a GPU. In this work we present SWM, a simplified, mul...

متن کامل

Solving Hyperbolic PDEs using Accelerator Architectures

This thesis investigates using highly parallel accelerator architectures to accelerate the simulation of nonlinear hyperbolic PDEs. Second-order accurate finite volume flux-limiter methods are implemented for the shallow water equations using cartesian grids and explicit time-stepping. The solver is implemented for three different architectures, a multicore x86 CPU using threading, IBM’s Cell P...

متن کامل

A 7.663-TOPS 8.2-W Energy-efficient FPGA Accelerator for Binary Convolutional Neural Networks

FPGA-based hardware accelerators for convolutional neural networks (CNNs) have obtained great attentions due to their higher energy efficiency than GPUs. However, it is challenging for FPGA-based solutions to achieve a higher throughput than GPU counterparts. In this paper, we demonstrate that FPGA acceleration can be a superior solution in terms of both throughput and energy efficiency when a ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017